Complexity-effective superscalar embedded processors using instruction-level distributed processing

Author

Ian Michael Caulfield

Abstract

Modern trends in mobile and embedded devices require ever increasing levels of performance, while maintaining low power consumption and silicon area usage. This thesis presents a new architecture for a high-performance embedded processor, based upon the instruction-level distributed processing (ILDP) methodology. A qualitative analysis of the complexity of an ILDP implementation as compared to both a typical scalar RISC CPU and a superscalar design is provided, which shows that the ILDP architecture eliminates or greatly reduces the size of a number of structures present in a superscalar architecture, allowing its complexity and power consumption to compare favourably with a simple scalar design. The performance of an implementation of the ILDP architecture is compared to some typical processors used in high-performance embedded systems. The effect on performance of a number of the architectural parameters is analysed, showing that many of the parallel structures used within the processor can be scaled to provide less parallelism with little cost to the overall performance. In particular, the size of the register file can be greatly reduced with little average effect on performance – a size of 32 registers, with 16 visible in the instruction set, is shown to provide a good trade-off between area/power and performance. Several novel developments to the ILDP architecture are then described and analysed. Firstly, a scheme to halve the number of processing elements and thus greatly reduce silicon area and power consumption is outlined but proves to result in a 12–14% drop in performance. Secondly, a method to reduce the area and power requirements of the memory logic in the architecture is presented which can achieve similar performance to the original architecture with a large reduction in area and power requirements or, at an increased area/power cost, can improve performance by approximately 24%. 
Finally, a new organisation for the register file is proposed, which reduces the silicon area used by the register file by approximately three-quarters and allows even greater power savings, especially in the case where processing elements are power gated. Overall, it is shown that the ILDP methodology is a viable approach for future embedded system design, and several new variants on the architecture are contributed. Several areas of useful future research are highlighted, especially with respect to compiler design for the ILDP paradigm.

Similar resources

For Embedded Applications with Data-level Parallelism, a Vector Processor Offers High Performance at Low Power Consumption and Low Design Complexity. Unlike Superscalar and VLIW Designs, a Vector Processor Is Scalable and Can Optimally Match Specific

Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...

C-slow Technique vs Multiprocessor in designing Low Area Customized Instruction set Processor for Embedded Applications

The demand for high-performance embedded processors for consumer electronics has been increasing rapidly over the past few years. Many of these embedded processors depend upon a custom-built instruction set architecture (ISA), as in game processors (GPUs), multimedia processors, and DSP processors. The primary requirement of the consumer electronics industry is low cost with high performance and low power cons...

Scalable Vector Processors for Embedded Systems

Complexity Effective ASIP Architectures for Network Processing and Multimedia Acceleration

1 Processor Design
1.1 Technology Trends
1.2 Application Trends
1.3 Choice of Implementation Platforms
1.4 ASIP Design Methodologies
1.5 Complexity Effective Desi...

Evaluating Compiler Support for Complexity Effective Network Processing

Statically scheduled processors are known to enable low complexity hardware implementations that lead to reduced design and verification time. However, statically scheduled processors are critically dependent on the compiler to exploit instruction level parallelism and deliver higher performance. In order to ascertain the suitability of statically scheduled processors for network processing (wh...

Publication date: 2007
